Abstract

We obtain moment and Gaussian bounds for general coordinate-wise
Lipschitz functions evaluated along the sample path of a Markov
chain. We treat Markov chains on general (possibly unbounded) state
spaces via a coupling method. If the first moment of the coupling
time exists, then we obtain a variance inequality. If a moment of order
$1+\epsilon$ of the coupling time exists, then depending on the behavior of
the stationary distribution, we obtain higher moment bounds. This
immediately implies polynomial concentration inequalities. In the case
that a moment of order $1+\epsilon$ is finite uniformly in the starting point of
the coupling, we obtain a Gaussian bound. We illustrate the general
results with house of cards processes, in which both uniform and
non-uniform behavior of moments of the coupling time can occur.

Keywords: Gaussian bound, moment bounds, house of cards process, 
Hamming distance. 

1 Introduction 

In this paper we consider a stationary Markov chain $(X_n)_{n\in\mathbb{Z}}$, and want to
obtain inequalities for the probability that a function $f(X_1,\ldots,X_n)$ deviates
from its expectation. In the spirit of concentration inequalities, one can try
to bound the exponential moment of $f-\mathbb{E}(f)$ in terms of the sum of squares
of the Lipschitz constants of $f$, as can be done in the case of independent
random variables by several methods [17].

In the present paper, we want to continue the line of thought developed
in [7, 8], where concentration inequalities are obtained via a combination of
the martingale difference approach (telescoping $f-\mathbb{E}(f)$) and coupling of
conditional distributions. In the case of an unbounded state space, we cannot
expect to find a coupling of which the tail of the distribution of the coupling 
time can be controlled uniformly in the starting points. This non-uniform 
dependence is thus rather the rule than the exception and has to be dealt 
with if one wants to go beyond the finite (or compact) state space situa- 
tion. Moreover, if the state space is continuous, then in general two copies 
of the process cannot be coupled such that they eventually coincide: we ex- 
pect rather that in a coupling the distance between the two copies can be 
controlled and becomes small when we go further in time. We show that 
a control of the distance suffices to obtain concentration inequalities. This 
leads to a "generalized coupling time" which in discrete settings coincides 
with the ordinary coupling time (in the case of a successful coupling). 

In order to situate our results in the existing literature, we want to stress 
that the main message of this paper is the connection between the behavior 
of the generalized coupling time and concentration inequalities. In order to 
illustrate the possibly non-uniform behavior of the coupling time, we concen- 
trate on the simplest possible example of "house of cards" processes (Markov 
chains on the natural numbers). In this paper we restrict to the Gaussian 
concentration inequality and moment inequalities. In principle, moment
inequalities with control on the constants can be "summarized" in the form
of Orlicz-norm inequalities, but we do not want to deal with this here.

The case of Markov chains was first considered by Marton [20, 21]: for
uniformly contracting Markov chains, in particular for ergodic Markov chains 
with finite state space, Gaussian concentration inequalities are obtained. The 
method developed in that paper is based on transportation cost-information 
inequalities. With the same technique, more general processes were consid- 
ered by her in [22]. Later, Samson [25] obtained Gaussian concentration 
inequalities for some classes of Markov chains and $\Phi$-mixing processes, by
following Marton's approach. Let us also mention the work by Djellout et al. 
[9] for further results in that direction. Chatterjee [6] introduced a version of 
Stein's method of exchangeable pairs to prove Gaussian as well as moment 
concentration inequalities. Notice that moment inequalities were obtained for
Lipschitz functions of independent random variables in [3] . Using martingale 
differences, Gaussian concentration inequalities were obtained in [151 121] for 
some classes of mixing processes. Markov contraction was used in [16] for
"Markov-type" processes (e.g., hidden Markov chains).

Work related to ours is found in [10, 11, 12], where deviation or concentration
inequalities [10] and speed of convergence to the stationary measure
[11, 12] are obtained for subgeometric Markov chains, using a technique of
regeneration times and Lyapounov functions. Concentration properties of
suprema of additive functionals of Markov chains are studied in [1], using a
technique of regeneration times. The example of the house of cards process,
and in particular its speed of relaxation to the stationary measure, is studied
in [11], Section 3.1. The speed of relaxation to the stationary measure is of
course related to the coupling time, see e.g. [23] for a nice recent account.
In fact, using an explicit coupling, we obtain concentration inequalities in
the different regimes of relaxation studied in [11].

Our paper is organized as follows. We start by defining the context and
introduce the telescoping procedure, combined with coupling. Here the notion
of coupling matrix is introduced. In terms of this matrix we can (pointwise)
bound the individual terms in the telescopic sum for $f-\mathbb{E}(f)$. We then turn
to the Markov case, where there is a further simplification in the coupling
matrix due to the Markov property of the coupling. In Section 5 we prove
a variance bound under the assumption that the first moment of the
(generalized) coupling time exists. In Section 6 we turn to moment inequalities.
In this case we require that a moment of order $1+\epsilon$ of the (generalized)
coupling time exists. This moment $M^{x,y}_{1+\epsilon}$ depends on the starting point of
the coupling. The moment inequality for moments of order $2p$ will then be
valid if (roughly speaking) the $2p$-th moment of $M^{x,y}_{1+\epsilon}$ exists. In Section 7
we prove that if a moment of order $1+\epsilon$ of the coupling time is finite, uniformly
in the starting point, then we have a Gaussian concentration bound.

Finally, Section 8 contains examples. In particular, we illustrate our
approach in the context of so-called house of cards processes, in which both
the uniform case (Gaussian bound) and the non-uniform
case (all moments, or moments up to a certain order) occur. We end with an
application of our moment bounds to measure concentration of Hamming
neighborhoods and get non-Gaussian measure concentration bounds.
2 Setting
2.1 The process 

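The state space of our process is denoted by $E$. It is supposed to be a metric
space with distance $d$. Elements of $E$ are denoted by $x, y, z$. $E$ is going to
serve as state space of a double-sided stationary process. Realizations of this
process are thus elements of $E^{\mathbb{Z}}$ and are denoted by $\mathbf{x}, \mathbf{y}, \mathbf{z}$.

We denote by $(X_n)_{n\in\mathbb{Z}}$ a (two-sided) stationary process with values in $E$.
The joint distribution of $(X_n)_{n\in\mathbb{Z}}$ is denoted by $\mathbb{P}$, and $\mathbb{E}$ denotes the
corresponding expectation.

$\mathcal{F}_{-\infty}^{i}$ denotes the sigma-field generated by $\{X_k : k\le i\}$,

\[ \mathcal{F}_{-\infty} := \bigcap_{i}\mathcal{F}_{-\infty}^{i} \]

denotes the tail sigma-field, and

\[ \mathcal{F} := \sigma\Bigl(\bigcup_{i=-\infty}^{\infty}\mathcal{F}_{-\infty}^{i}\Bigr). \]

We assume in the whole of this paper that $\mathbb{P}$ is tail trivial, i.e., for all sets
$A\in\mathcal{F}_{-\infty}$, $\mathbb{P}(A)\in\{0,1\}$.

For $i<j$, $i,j\in\mathbb{Z}$, we denote by $X_i^j$ the vector $(X_i,X_{i+1},\ldots,X_j)$, and
similarly we have the notation $X_{-\infty}^{i}$, $X_i^{\infty}$. Elements of $E^{\{i,i+1,\ldots,j\}}$ (i.e.,
realizations of $X_i^j$) are denoted by $x_i^j$, and similarly we have $x_{-\infty}^{i}$, $x_i^{\infty}$.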



2.2 Conditional distributions, Lipschitz functions 

We denote by $\mathbb{P}_{x_{-\infty}^{i}}$ the joint distribution of $\{X_j : j\ge i+1\}$ given
$X_{-\infty}^{i}=x_{-\infty}^{i}$. We assume that this object is defined for all $x_{-\infty}^{i}$, i.e., that there exists
a specification with which $\mathbb{P}$ is consistent. This is automatically satisfied in
our setting, see [14].

Further, $\hat{\mathbb{P}}_{x_{-\infty}^{i},\,y_{-\infty}^{i}}$ denotes a coupling of $\mathbb{P}_{x_{-\infty}^{i}}$ and $\mathbb{P}_{y_{-\infty}^{i}}$.

For $f: E^{\mathbb{Z}}\to\mathbb{R}$, we define the $i$-th Lipschitz constant

\[ \delta_i(f) := \sup\Bigl\{\frac{f(\mathbf{x})-f(\mathbf{y})}{d(x_i,y_i)} : x_j=y_j,\ \forall j\neq i,\ x_i\neq y_i\Bigr\}. \]

The function $f$ is said to be Lipschitz in the $i$-th coordinate if $\delta_i(f)<\infty$,
and Lipschitz in all coordinates if $\delta_i(f)<\infty$ for all $i$. We use the
notation $\delta(f)=(\delta_i(f))_{i\in\mathbb{Z}}$. We denote by $\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})$ the set of all
real-valued functions on $E^{\mathbb{Z}}$ which are Lipschitz in all coordinates.
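The coordinate-wise Lipschitz constants can be computed by brute force on a small example. The sketch below is only an illustration under simplifying assumptions that are not in the text: a finite window of `n` coordinates instead of $E^{\mathbb{Z}}$, a finite alphabet, and the discrete metric. The names `lipschitz_constants`, `telescopic_inequality_holds` and `mean` are hypothetical helpers.

```python
from itertools import product

def discrete_metric(a, b):
    """Trivial distance: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0

def lipschitz_constants(f, alphabet, n, d=discrete_metric):
    """delta_i(f), i = 0..n-1: sup of (f(x) - f(y)) / d(x_i, y_i)
    over all pairs x, y that differ only in coordinate i."""
    deltas = []
    for i in range(n):
        best = 0.0
        for x in product(alphabet, repeat=n):
            for b in alphabet:
                if b == x[i]:
                    continue
                y = x[:i] + (b,) + x[i + 1:]
                best = max(best, (f(x) - f(y)) / d(x[i], b))
        deltas.append(best)
    return deltas

def telescopic_inequality_holds(f, deltas, alphabet, n, d=discrete_metric):
    """Check |f(x) - f(y)| <= sum_i delta_i(f) d(x_i, y_i) for all x, y."""
    points = list(product(alphabet, repeat=n))
    return all(
        abs(f(x) - f(y)) <= sum(dl * d(a, b) for dl, a, b in zip(deltas, x, y)) + 1e-12
        for x in points for y in points
    )

def mean(x):
    """Example function: the empirical mean of the coordinates."""
    return sum(x) / len(x)

deltas = lipschitz_constants(mean, alphabet=(0, 1), n=3)
```

For the empirical mean of three binary coordinates each constant equals 1/3, so the sum of squares of the Lipschitz constants is 1/3.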



3 Telescoping and the coupling matrix 

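We start with $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})\cap L^2(\mathbb{P})$, and begin with the classical telescoping
(martingale-difference) identity

\[ f-\mathbb{E}(f)=\sum_{i=-\infty}^{\infty}\Delta_i, \]

where

\[ \Delta_i := \mathbb{E}(f\mid\mathcal{F}_{-\infty}^{i})-\mathbb{E}(f\mid\mathcal{F}_{-\infty}^{i-1}). \]

We then write, using the notation of Section 2.1,

\[ \Delta_i=\Delta_i(X_{-\infty}^{i})=\int d\mathbb{P}_{X_{-\infty}^{i-1}}(z_i)\int d\hat{\mathbb{P}}_{X_{-\infty}^{i},\,X_{-\infty}^{i-1}z_i}(y_{i+1}^{\infty},z_{i+1}^{\infty})\,\bigl[f(X_{-\infty}^{i}y_{i+1}^{\infty})-f(X_{-\infty}^{i-1}z_i^{\infty})\bigr]. \tag{1} \]

For $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})$, we have the following obvious telescopic inequality

\[ |f(\mathbf{x})-f(\mathbf{y})|\le\sum_{i\in\mathbb{Z}}\delta_i(f)\,d(x_i,y_i). \tag{2} \]

Combining (1) and (2) one obtains

\[ |\Delta_i(X_{-\infty}^{i})|\le\sum_{j=0}^{\infty}D^{X_{-\infty}^{i}}_{i,i+j}\,\delta_{i+j}(f), \tag{3} \]

where, with the convention $y_i:=X_i$,

\[ D^{X_{-\infty}^{i}}_{i,i+j}:=\int d\mathbb{P}_{X_{-\infty}^{i-1}}(z_i)\int d\hat{\mathbb{P}}_{X_{-\infty}^{i},\,X_{-\infty}^{i-1}z_i}(y_{i+1}^{\infty},z_{i+1}^{\infty})\,d(y_{i+j},z_{i+j}). \tag{4} \]

This is an upper-triangular random matrix which we call the coupling matrix
associated with the process $(X_n)$ and $\hat{\mathbb{P}}$, the coupling of the conditional
distributions. As we obtained before in [7], in the context of $E$ a finite set, the
decay properties of the matrix elements $D^{X_{-\infty}^{i}}_{i,i+j}$ (i.e., how these matrix elements
become small when $j$ becomes large) determine the concentration properties
of $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})$, via the control (3) on $\Delta_i$, together with Burkholder's
inequality [5, Theorem 3.1, p. 87], which relates the moments of $f-\mathbb{E}(f)$
with powers of the sum of squares of $\Delta_i$. The non-uniformity (as a function
of the realization of $X_{-\infty}^{i}$) of the decay of the matrix elements as a function
of $j$ (which we encountered e.g. in the low-temperature Ising model [7]) will
be typical as soon as the state space $E$ is unbounded. Indeed, if the starting
points in the coupling are further apart, then it takes more time to get the
copies close in the coupling.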
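The telescoping identity can be checked exactly on a toy case. The following sketch makes assumptions that are not part of the text: a two-state chain with an arbitrary illustrative transition matrix `P`, a finite window $X_1,\ldots,X_n$ with the trivial sigma-field at index 0 in place of the two-sided setting, and hypothetical helper names. It computes $\Delta_i$ by enumeration with exact fractions.

```python
from fractions import Fraction
from itertools import product

# Illustrative two-state chain (an assumption, not taken from the paper).
P = {0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
     1: {0: Fraction(1, 2), 1: Fraction(1, 2)}}
NU = {0: Fraction(2, 3), 1: Fraction(1, 3)}  # stationary: NU P = NU

def path_prob(path):
    """Probability of a finite path of the stationary chain."""
    prob = NU[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
    return prob

def cond_exp(f, prefix, n):
    """E(f | X_1..X_i = prefix), by enumerating all completions."""
    num = den = Fraction(0)
    for tail in product((0, 1), repeat=n - len(prefix)):
        path = tuple(prefix) + tail
        w = path_prob(path)
        num += w * f(path)
        den += w
    return num / den

def martingale_differences(f, path):
    """Delta_i = E(f | F_i) - E(f | F_{i-1}) along a given path, i = 1..n."""
    n = len(path)
    return [cond_exp(f, path[:i], n) - cond_exp(f, path[:i - 1], n)
            for i in range(1, n + 1)]

def f(path):
    """Example function of the path."""
    return sum(path) ** 2

n = 3
mean_f = cond_exp(f, (), n)
```

Summing `martingale_differences(f, path)` recovers `f(path) - mean_f` for every path, which is the finite-window version of the identity above.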

Remark 3.1. The same telescoping procedure can be obtained for
"coordinate-wise Hölder" functions, i.e., functions such that for some $0<\alpha<1$

\[ \delta_i^{\alpha}(f) := \sup\Bigl\{\frac{f(\mathbf{x})-f(\mathbf{y})}{d^{\alpha}(x_i,y_i)} : x_j=y_j,\ \forall j\neq i,\ x_i\neq y_i\Bigr\} \]

is finite for all $i$. In (4), we then have to replace $d$ by $d^{\alpha}$.

4 The Markov case 

We now consider $(X_n)_{n\in\mathbb{Z}}$ to be a stationary and ergodic Markov chain. We
denote by $p(x,dy) := \mathbb{P}(X_1\in dy\mid X_0=x)$ the transition kernel. We let $\nu$ be
the unique stationary measure of the Markov chain. We denote by $\mathbb{P}_\nu$ the
path space measure of the stationary process $(X_n)_{n\in\mathbb{Z}}$. By $\mathbb{P}_x$ we denote the
distribution of $(X_n)_{n\in\mathbb{N}}$, for the Markov process conditioned on $X_0=x$.

We further suppose that the coupling $\hat{\mathbb{P}}$ of Section 2.2 is Markovian, and
denote by $\hat{\mathbb{P}}_{x,y}$ the coupling started from $x,y$, and the corresponding expectation
by $\hat{\mathbb{E}}_{x,y}$. More precisely, by the Markov property of the coupling we then
have that

\[ \hat{\mathbb{P}}_{x_{-\infty}^{i},\,y_{-\infty}^{i}}=\hat{\mathbb{P}}_{x_i,y_i} \]

is a Markovian coupling $((\hat X_n^{(1)},\hat X_n^{(2)}))_{n\in\mathbb{N}}$ of the Markov chains $(X_n)_{n\ge0}$ starting
from $X_0=x_i$, resp. $Y_0=y_i$. In this case the expression (4) of the coupling
matrix simplifies to

\[ \Psi_{X_{i-1},X_i}(j) := D^{X_{i-1},X_i}_{i,i+j}=\int p(X_{i-1},dy)\int d\hat{\mathbb{P}}_{X_i,y}(u_0^{\infty},v_0^{\infty})\,d(u_j,v_j). \]

With this notation, (3) reads

\[ |\Delta_i(X_{i-1},X_i)|\le\sum_{j\ge0}\Psi_{X_{i-1},X_i}(j)\,\delta_{i+j}(f). \tag{5} \]

We define the "generalized coupling time"

\[ \tau(u_0^{\infty},v_0^{\infty}) := \sum_{j=0}^{\infty}d(u_j,v_j). \tag{6} \]

In the case where $E$ is a discrete (finite or countable) alphabet, the "classical"
coupling time is defined as usual:

\[ T(u_0^{\infty},v_0^{\infty}) := \inf\{k\ge0 : \forall j\ge k:\ u_j=v_j\}. \]

If we use the trivial distance $d(x,y)=1$ if $x\neq y$ and $d(x,y)=0$ if $x=y$,
for $x,y\in E$, then we have

\[ d(u_j,v_j)\le\mathbb{1}\{T(u_0^{\infty},v_0^{\infty})\ge j\} \tag{7} \]

and hence

\[ \tau(u_0^{\infty},v_0^{\infty})\le T(u_0^{\infty},v_0^{\infty}). \]

Of course, the same inequality remains true if $E$ is a bounded metric space
with $d(x,y)\le1$ for $x,y\in E$. However, a "successful coupling" (i.e., a
coupling with $T<\infty$) is not expected to exist in general in the case of a
non-discrete state space. It can however exist, see e.g. [13] for a successful
coupling in the context of Zhang's model of self-organized criticality. Let us
also mention that the "generalized coupling time" unavoidably appears in
the context of dynamical systems [8].
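To make the two notions of coupling time concrete, here is a small sketch on finite trajectories. It assumes (this is not in the text) that both trajectories are observed on a finite window and agree from the coupling time to the end of that window; the function names are hypothetical.

```python
def generalized_coupling_time(u, v, d):
    """Sum over the window of d(u_j, v_j), as in (6)."""
    return sum(d(a, b) for a, b in zip(u, v))

def classical_coupling_time(u, v):
    """Smallest k such that u_j == v_j for all j >= k within the window."""
    k = len(u)
    while k > 0 and u[k - 1] == v[k - 1]:
        k -= 1
    return k

def trivial_distance(a, b):
    """Discrete metric on the alphabet."""
    return 0 if a == b else 1

u = [1, 0, 1, 2, 2]
v = [0, 0, 3, 2, 2]
tau = generalized_coupling_time(u, v, trivial_distance)
T = classical_coupling_time(u, v)
```

In this example the two copies disagree at times 0 and 2 only, so the generalized coupling time counts 2 disagreements while the classical coupling time is 3: the copies may meet temporarily before they coincide forever.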

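In the discrete case, using (5) and (7), we obtain the following inequality:

\[ \Psi_{x,y}(j)\le\sum_{z}p(x,z)\,\hat{\mathbb{E}}_{y,z}\bigl(\mathbb{1}\{T(u_0^{\infty},v_0^{\infty})\ge j\}\bigr), \tag{8} \]

whereas in the general (not necessarily discrete) case we have, by (6) and
monotone convergence,

\[ \sum_{j\ge0}\Psi_{x,y}(j)=\int p(x,dz)\,\hat{\mathbb{E}}_{y,z}(\tau). \tag{9} \]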
Remark 4.1. So far, we made a telescoping of $f-\mathbb{E}(f)$ using an increasing
family of sigma-fields. One can as well consider a decreasing family of
sigma-fields, such as $\mathcal{F}_i^{\infty}$, defined to be the sigma-fields generated by $\{X_k : k\ge i\}$.
We then have, mutatis mutandis, the same inequalities using "backward
telescoping"

\[ f-\mathbb{E}(f)=\sum_{i=-\infty}^{\infty}\Delta_i^{*}, \]

where

\[ \Delta_i^{*} := \mathbb{E}(f\mid\mathcal{F}_i^{\infty})-\mathbb{E}(f\mid\mathcal{F}_{i+1}^{\infty}), \]

and estimating $\Delta_i^{*}$ in a completely parallel way, by introducing a
lower-triangular analogue of the coupling matrix.

Backward telescoping is natural in the context of dynamical systems where 
the forward process is deterministic, hence cannot be coupled (as defined 
above) with two different initial conditions such that the copies become closer 
and closer. However, backwards in time, such processes are non-trivial Markov 
chains for which a coupling can be possible with good decay properties of the 
coupling matrix. See [8] for a concrete example with piecewise expanding
maps of the interval.

5 Variance inequality 

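For a real-valued sequence $(a_i)_{i\in\mathbb{Z}}$, we denote the usual $\ell_p$-norm by

\[ \|a\|_p=\Bigl(\sum_{i\in\mathbb{Z}}|a_i|^p\Bigr)^{1/p}. \]

Our first result concerns the variance of an $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})$.

Theorem 5.1. Let $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})\cap L^2(\mathbb{P}_\nu)$. Then

\[ \mathrm{Var}(f)\le C\,\|\delta(f)\|_2^2, \tag{10} \]

where

\[ C=\int\nu(dx)\int p(x,dz)\int p(x,dy)\int p(x,du)\,\hat{\mathbb{E}}_{z,y}(\tau)\,\hat{\mathbb{E}}_{z,u}(\tau). \tag{11} \]

As a consequence, we have the concentration inequality

\[ \forall t>0,\quad \mathbb{P}(|f-\mathbb{E}(f)|\ge t)\le C\,\frac{\|\delta(f)\|_2^2}{t^2}. \tag{12} \]

Proof. We estimate, using (5) and stationarity,

\[ \mathbb{E}(\Delta_i^2)\le\mathbb{E}\Bigl(\sum_{j\ge0}\Psi_{X_{i-1},X_i}(j)\,\delta_{i+j}(f)\Bigr)^2=\mathbb{E}\bigl((\Psi_{X_0,X_1}*\delta(f))_i^2\bigr), \]

where $*$ denotes convolution, and where we extended $\Psi$ to $\mathbb{Z}$ by putting it
equal to zero for negative integers. Since

\[ \mathrm{Var}(f)=\sum_{i=-\infty}^{\infty}\mathbb{E}(\Delta_i^2), \]

using Young's inequality, we then obtain

\[ \mathrm{Var}(f)\le\mathbb{E}\bigl(\|\Psi_{X_0,X_1}*\delta(f)\|_2^2\bigr)\le\mathbb{E}\bigl(\|\Psi_{X_0,X_1}\|_1^2\bigr)\,\|\delta(f)\|_2^2. \]

Now, using the equality in (9),

\[ \mathbb{E}\bigl(\|\Psi_{X_0,X_1}\|_1^2\bigr)=\mathbb{E}\Bigl(\int p(X_0,dy)\,\hat{\mathbb{E}}_{X_1,y}(\tau)\Bigr)^2=\int\nu(dx)\,p(x,dz)\Bigl(\int p(x,dy)\,\hat{\mathbb{E}}_{z,y}(\tau)\Bigr)^2 \]
\[ =\int\nu(dx)\int p(x,dz)\int p(x,dy)\int p(x,du)\,\hat{\mathbb{E}}_{z,y}(\tau)\,\hat{\mathbb{E}}_{z,u}(\tau), \]

which is (10). Inequality (12) follows from Chebychev's inequality. □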

The expectation in (11) can be interpreted as follows. We start from a
point $x$ drawn from the stationary distribution and generate three independent
copies $y, u, z$ from the Markov chain at time $t=1$ started from $x$. With
these initial points we start the coupling in couples $(y,z)$ and $(u,z)$, and
compute the expected coupling time.
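As an illustration of this interpretation, the constant in (11) can be evaluated exactly for a two-state chain. The sketch below rests on assumptions that are not in the text: transition probabilities `a = p(0,1)` and `b = p(1,0)` with `a + b <= 1`, and a coupling in which both copies are driven by the same uniform variable. Under that coupling two copies started from different states keep disagreeing with probability `1 - a - b` at each step, so the expected (generalized) coupling time is `1 / (a + b)`, and it is 0 when they start from the same state. The function names are hypothetical.

```python
from fractions import Fraction

def variance_constant(p, nu, expected_coupling_time):
    """Constant C of (11) for a finite chain, as a sum over x, z, y, u."""
    states = list(nu)
    total = Fraction(0)
    for x in states:
        for z in states:
            for y in states:
                for u in states:
                    total += (nu[x] * p[x][z] * p[x][y] * p[x][u]
                              * expected_coupling_time(z, y)
                              * expected_coupling_time(z, u))
    return total

# Illustrative two-state chain (values chosen for the example only).
a, b = Fraction(1, 4), Fraction(1, 2)
p = {0: {0: 1 - a, 1: a}, 1: {0: b, 1: 1 - b}}
nu = {0: b / (a + b), 1: a / (a + b)}  # stationary measure

def expected_coupling_time(z, y):
    """Expected coupling time of the shared-uniform coupling (assumed)."""
    return Fraction(0) if z == y else 1 / (a + b)

C = variance_constant(p, nu, expected_coupling_time)
```

For this family of chains the sum works out to $ab(2-a-b)/(a+b)^3$, which equals 10/27 for the values above.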
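6 Moment inequalities

In order to control higher moments of $f-\mathbb{E}(f)$, we have to tackle higher
moments of the sum $\sum_i\Delta_i^2$, and for these we cannot use the simple
stationarity argument used in the estimation of the variance.

Instead, we start again from (5) and let $\lambda^2(j) := (j+1)^{1+\epsilon}$ where $\epsilon>0$.

We then obtain, using the Cauchy-Schwarz inequality:

\[ |\Delta_i|\le\sum_{j\ge0}\lambda(j)\,\Psi_{X_{i-1},X_i}(j)\,\frac{\delta_{i+j}(f)}{\lambda(j)}\le\Bigl(\sum_{j\ge0}\lambda(j)^2\bigl(\Psi_{X_{i-1},X_i}(j)\bigr)^2\Bigr)^{1/2}\Bigl(\sum_{k\ge0}\Bigl(\frac{\delta_{i+k}(f)}{\lambda(k)}\Bigr)^2\Bigr)^{1/2}. \]

Hence

\[ \Delta_i^2\le\Psi_\epsilon^2(X_{i-1},X_i)\,\Bigl((\delta(f))^2*\frac{1}{\lambda^2}\Bigr)_i, \tag{13} \]

where $(\delta(f))^2$ denotes the sequence with components $(\delta_i(f))^2$, and where

\[ \Psi_\epsilon^2(X_{i-1},X_i)=\sum_{j\ge0}(j+1)^{1+\epsilon}\bigl(\Psi_{X_{i-1},X_i}(j)\bigr)^2. \tag{14} \]

Moment inequalities will now be expressed in terms of moments of $\Psi_\epsilon^2$.

6.1 Moment inequalities in the discrete case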

6.1 Moment inequalities in the discrete case 

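We first deal with a discrete state space $E$. Recall (7).

Lemma 6.1. In the discrete case, i.e., if $E$ is a countable set with the discrete
metric, then, for all $\epsilon>0$, we have the estimate

\[ \Psi_\epsilon^2(X_{i-1},X_i)\le\frac12\Bigl(\sum_{z}p(X_{i-1},z)\,\hat{\mathbb{E}}_{X_i,z}\bigl((T+1)^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^2. \tag{15} \]

Proof. Start with

\[ \Psi_\epsilon^2(X_{i-1},X_i)\le\sum_{z,u}\sum_{j\ge0}(j+1)^{1+\epsilon}\,p(X_{i-1},z)\,p(X_{i-1},u)\,\hat{\mathbb{P}}_{X_i,z}(T\ge j)\,\hat{\mathbb{P}}_{X_i,u}(T\ge j). \]

Proceed now with

\[ \sum_{j\ge0}(j+1)^{1+\epsilon}\,\hat{\mathbb{P}}_{X_i,z}(T\ge j)\,\hat{\mathbb{P}}_{X_i,u}(T\ge j)=\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}\hat{\mathbb{P}}_{X_i,z}(T=k)\,\hat{\mathbb{P}}_{X_i,u}(T=l)\sum_{j=0}^{l\wedge k}(j+1)^{1+\epsilon} \]
\[ \le\frac12\sum_{k=0}^{\infty}\sum_{l=0}^{\infty}\hat{\mathbb{P}}_{X_i,z}(T=k)\,\hat{\mathbb{P}}_{X_i,u}(T=l)\,\bigl((k+1)\wedge(l+1)\bigr)^{2+\epsilon}=\frac12\,\mathbb{E}\bigl(((T_1+1)\wedge(T_2+1))^{2+\epsilon}\bigr), \]

where we denoted by $T_1$ and $T_2$ two independent coupling times corresponding
to two independent copies of the coupling started from $(X_i,z)$, resp. $(X_i,u)$.

Now use that for two independent non-negative real-valued random variables
$X, Y$ we have

\[ \mathbb{E}\bigl((X\wedge Y)^{2+\epsilon}\bigr)\le\mathbb{E}\bigl(X^{1+\frac{\epsilon}{2}}\bigr)\,\mathbb{E}\bigl(Y^{1+\frac{\epsilon}{2}}\bigr). \]

The lemma is proved. □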
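The last inequality used in the proof, for two independent non-negative random variables, follows from a pointwise bound. A short worked justification (elementary, not taken from the source):

```latex
% For real numbers a, b >= 0, split the exponent 2 + \epsilon in two equal parts:
\[
(a\wedge b)^{2+\epsilon}
  =(a\wedge b)^{1+\frac{\epsilon}{2}}\,(a\wedge b)^{1+\frac{\epsilon}{2}}
  \le a^{1+\frac{\epsilon}{2}}\,b^{1+\frac{\epsilon}{2}} .
\]
% Apply this with a = X, b = Y, take expectations, and use independence:
\[
\mathbb{E}\bigl((X\wedge Y)^{2+\epsilon}\bigr)
  \le \mathbb{E}\bigl(X^{1+\frac{\epsilon}{2}}\,Y^{1+\frac{\epsilon}{2}}\bigr)
  = \mathbb{E}\bigl(X^{1+\frac{\epsilon}{2}}\bigr)\,\mathbb{E}\bigl(Y^{1+\frac{\epsilon}{2}}\bigr).
\]
```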

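In order to arrive at moment estimates, we want an estimate for $\mathbb{E}\bigl(\sum_i\Delta_i^2\bigr)^p$.
This is the content of the next lemma. We denote, as usual, $\zeta(s)=\sum_{n=1}^{\infty}(1/n)^s$.

Lemma 6.2. For all $\epsilon>0$ and integers $p>0$ we have

\[ \mathbb{E}\Bigl(\sum_i\Delta_i^2\Bigr)^p\le\Bigl(\frac{\zeta(1+\epsilon)}{2}\Bigr)^p\|\delta(f)\|_2^{2p}\sum_{x,y}\nu(x)\,p(x,y)\Bigl(\sum_{z}p(x,z)\,\hat{\mathbb{E}}_{y,z}\bigl((T+1)^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^{2p}. \tag{16} \]

Proof. We start from (13):

\[ \mathbb{E}\Bigl(\sum_i\Delta_i^2\Bigr)^p\le\sum_{i_1,\ldots,i_p}\mathbb{E}\Bigl(\prod_{l=1}^{p}\Psi_\epsilon^2(X_{i_l-1},X_{i_l})\Bigr)\prod_{l=1}^{p}\Bigl((\delta(f))^2*\frac{1}{\lambda^2}\Bigr)_{i_l}. \]

Then use Hölder's inequality and stationarity, to obtain

\[ \mathbb{E}\Bigl(\sum_i\Delta_i^2\Bigr)^p\le\mathbb{E}\bigl(\Psi_\epsilon^{2p}(X_0,X_1)\bigr)\,\Bigl\|(\delta(f))^2*\frac{1}{\lambda^2}\Bigr\|_1^p\le\mathbb{E}\bigl(\Psi_\epsilon^{2p}(X_0,X_1)\bigr)\,\Bigl\|\frac{1}{\lambda^2}\Bigr\|_1^p\,\bigl\|(\delta(f))^2\bigr\|_1^p \]
\[ =\mathbb{E}\bigl(\Psi_\epsilon^{2p}(X_0,X_1)\bigr)\,\zeta(1+\epsilon)^p\,\|\delta(f)\|_2^{2p}, \]

where in the second inequality we used Young's inequality. The lemma now
follows from (15). □

We can now formulate our moment estimates in the discrete case.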

We can now formulate our moment estimates in the discrete case. 

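Theorem 6.1. Suppose $E$ is a countable set with the discrete metric. Let $p\ge1$
be an integer and $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})\cap L^{2p}(\mathbb{P})$. Then for all $\epsilon>0$ we have the
estimate

\[ \mathbb{E}\bigl(f-\mathbb{E}(f)\bigr)^{2p}\le C_p\,\|\delta(f)\|_2^{2p}, \tag{17} \]

where

\[ C_p=(2p-1)^{2p}\Bigl(\frac{\zeta(1+\epsilon)}{2}\Bigr)^p\sum_{x,y}\nu(x)\,p(x,y)\Bigl(\sum_{z}p(x,z)\,\hat{\mathbb{E}}_{y,z}\bigl((T+1)^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^{2p}. \tag{18} \]

As a consequence we have the concentration inequalities

\[ \forall t>0,\quad \mathbb{P}(|f-\mathbb{E}(f)|\ge t)\le C_p\,\frac{\|\delta(f)\|_2^{2p}}{t^{2p}}. \tag{19} \]

Proof. By Burkholder's inequality [5, Theorem 3.1, p. 87], one gets

\[ \mathbb{E}\bigl((f-\mathbb{E}(f))^{2p}\bigr)\le(2p-1)^{2p}\,\mathbb{E}\Bigl(\Bigl(\sum_i\Delta_i^2\Bigr)^p\Bigr), \]

and (18) then follows from (16), whereas (19) follows from (18) by Markov's
inequality. □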
Remark 6.1. Theorem 6.1 for $p=1$ is weaker than Theorem 5.1: indeed,
for (10) to hold we only need the first moment of the coupling time
to be finite.

Remark 6.2. A typical behavior (see the examples below) of the coupling
time is as follows:

\[ \hat{\mathbb{P}}_{x,y}(T\ge j)\le C(x,y)\,\phi(j). \]

Here $C(x,y)$ is a constant that depends, in general in an unbounded way, on
the starting points $(x,y)$ in the coupling, and $\phi(j)$, which determines the tail
of the coupling time, does not depend on the starting points. Therefore, for
the finiteness of the constant $C_p$ in (18) we need that the tail estimate $\phi(j)$
decays fast enough so that $\sum_j j^{\epsilon}\phi(j)<\infty$ (this does not depend on $p$), and
next the $2p$-th power of the constant $C(x,y)$ has to be integrable (this depends
on $p$).



6.2 The general state space case 

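In order to formulate the general state space version of these results, we
introduce the expectation

\[ \tilde{\mathbb{E}}_{x,y}\bigl(F(u,v)\bigr)=\int p(x,dz)\int\hat{\mathbb{P}}_{y,z}(du,dv)\,F(u,v). \]

We can then rewrite

\[ \Psi_\epsilon^2(x,y)=\sum_{j\ge0}(j+1)^{1+\epsilon}\bigl(\tilde{\mathbb{E}}_{x,y}\,d(u_j,v_j)\bigr)^2. \]

We introduce the differences

\[ \tilde{\mathbb{E}}_{x,y}\bigl(d(u_j,v_j)\bigr)-\tilde{\mathbb{E}}_{x,y}\bigl(d(u_{j+1},v_{j+1})\bigr). \]

This quantity is the analogue of $\mathbb{P}(T=j)$ of the discrete case. We then
define

\[ M_r^{x,y}=\sum_{j\ge0}(j+1)^r\Bigl(\tilde{\mathbb{E}}_{x,y}\bigl(d(u_j,v_j)\bigr)-\tilde{\mathbb{E}}_{x,y}\bigl(d(u_{j+1},v_{j+1})\bigr)\Bigr), \tag{20} \]

which is the analogue of the $r$-th moment of the coupling time. The analogue
of Theorem 6.1 then becomes the following.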
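Theorem 6.2. Let $p\ge1$ be an integer and $f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})\cap L^{2p}(\mathbb{P})$. Then
for all $\epsilon>0$ we have the estimate

\[ \mathbb{E}\bigl(f-\mathbb{E}(f)\bigr)^{2p}\le C_p\,\|\delta(f)\|_2^{2p}, \]

where

\[ C_p=(2p-1)^{2p}\Bigl(\frac{\zeta(1+\epsilon)}{2}\Bigr)^p\int\nu(dx)\,p(x,dy)\,\bigl(M^{x,y}_{1+\frac{\epsilon}{2}}\bigr)^{2p}. \]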

7 Gaussian concentration bound 

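If one has a uniform estimate of the quantity (14), we obtain a corresponding
uniform estimate for $\Delta_i^2$ and, via Hoeffding's inequality, a Gaussian bound
for $f-\mathbb{E}(f)$. This is formulated in the following theorem.

Theorem 7.1. Let $E$ be a countable set with the discrete metric. Let
$f\in\mathrm{Lip}(E^{\mathbb{Z}},\mathbb{R})$ be such that $\exp(f)\in L^1(\mathbb{P})$. Then for all $\epsilon>0$ we have

\[ \mathbb{E}\bigl(e^{f-\mathbb{E}(f)}\bigr)\le e^{C\|\delta(f)\|_2^2/16}, \tag{21} \]

where

\[ C=\zeta(1+\epsilon)\Bigl(\sup_{u,v}\hat{\mathbb{E}}_{u,v}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^2. \tag{22} \]

In particular, we get the concentration inequality

\[ \forall t>0,\quad \mathbb{P}(|f-\mathbb{E}(f)|\ge t)\le2\exp\Bigl(\frac{-4t^2}{C\,\|\delta(f)\|_2^2}\Bigr). \tag{23} \]

The general state space analogue of these bounds is obtained by replacing
$\hat{\mathbb{E}}_{u,v}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)$ by $M^{u,v}_{1+\frac{\epsilon}{2}}$ in (22) (where $M^{u,v}_r$ is defined in (20)).

Proof. From (15) we get

\[ \Psi_\epsilon^2(x,y)\le\frac12\sum_{z,u}p(x,z)\,p(x,u)\,\hat{\mathbb{E}}_{y,z}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)\,\hat{\mathbb{E}}_{y,u}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)\le\frac12\Bigl(\sup_{u,v}\hat{\mathbb{E}}_{u,v}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^2. \]

Now start from the classical Azuma-Hoeffding inequality

\[ \mathbb{E}\bigl(e^{f-\mathbb{E}(f)}\bigr)\le e^{\frac18\sum_i\|\Delta_i\|_\infty^2}. \]

Therefore, we estimate, using (15) and Young's inequality,

\[ \sum_i\|\Delta_i\|_\infty^2\le\frac12\Bigl(\sup_{u,v}\hat{\mathbb{E}}_{u,v}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^2\Bigl\|(\delta(f))^2*\frac{1}{\lambda^2}\Bigr\|_1\le\frac12\,\zeta(1+\epsilon)\Bigl(\sup_{u,v}\hat{\mathbb{E}}_{u,v}\bigl(T^{1+\frac{\epsilon}{2}}\bigr)\Bigr)^2\|\delta(f)\|_2^2, \]

which establishes (21). Inequality (23) follows from (21) by the optimized
exponential Chebychev inequality. □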
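The last step of the proof can be spelled out. The following worked derivation is a sketch under one assumption not stated explicitly in the text: that (21) may be applied to $\lambda f$ for every $\lambda>0$ (so that $\delta(\lambda f)=\lambda\,\delta(f)$ and the integrability condition holds for $\lambda f$).

```latex
% Exponential Chebychev inequality with a free parameter \lambda > 0, then (21) for \lambda f:
\[
\mathbb{P}\bigl(f-\mathbb{E}(f)\ge t\bigr)
  \le e^{-\lambda t}\,\mathbb{E}\bigl(e^{\lambda(f-\mathbb{E}(f))}\bigr)
  \le \exp\Bigl(-\lambda t+\frac{C\lambda^2\|\delta(f)\|_2^2}{16}\Bigr).
\]
% The exponent is minimized at \lambda^* = 8t / (C \|\delta(f)\|_2^2), which gives
\[
-\lambda^* t+\frac{C(\lambda^*)^2\|\delta(f)\|_2^2}{16}
  = -\frac{8t^2}{C\|\delta(f)\|_2^2}+\frac{4t^2}{C\|\delta(f)\|_2^2}
  = -\frac{4t^2}{C\|\delta(f)\|_2^2}.
\]
% The same bound applied to -f and a union bound yield the factor 2 in (23).
```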

Remark 7.1. The assumption that a moment of order $1+\epsilon$ of the coupling
time exists, which is uniformly bounded in the starting point, can be
weakened to the same property for the first moment, if we have some form of
monotonicity. More precisely, we say that a coupling has the monotonicity
property if there exist "worst case starting points" $x_u, x_l$, which have the
property that

\[ \sup_{x,y}\hat{\mathbb{P}}_{x,y}(T\ge j)\le\hat{\mathbb{P}}_{x_u,x_l}(T\ge j) \]

for all $j\ge0$. In that case, using (8), we can start from (5) and obtain, in
the discrete case, the uniform bound

\[ \Delta_i\le\sum_{j\ge0}\hat{\mathbb{P}}_{x_u,x_l}(T\ge j)\,\delta_{i+j}(f), \]

and via the Azuma-Hoeffding inequality, combined with Young's inequality, we
then obtain the Gaussian bound (21) with

\[ C=\frac12\,\hat{\mathbb{E}}_{x_u,x_l}(T). \]

Finally, it can happen (especially if the state space is unbounded) that the
coupling has no worst case starting points, but there is a sequence $x_u^n, x_l^n$
of elements of the state space such that $\hat{\mathbb{P}}_{x_u^n,x_l^n}(T\ge j)$ is a non-decreasing
sequence in $n$ for every fixed $j$ and

\[ \hat{\mathbb{P}}_{x,y}(T\ge j)\le\lim_{n\to\infty}\hat{\mathbb{P}}_{x_u^n,x_l^n}(T\ge j). \]

(E.g., in the case of the state space $\mathbb{Z}$, we can think of the sequence $x_u^n\to\infty$
and $x_l^n\to-\infty$.) In that case, from monotone convergence we have the
Gaussian concentration bound with

\[ C=\lim_{n\to\infty}\frac12\,\hat{\mathbb{E}}_{x_u^n,x_l^n}(T). \]
8 Examples

8.1 Finite-state Markov chains 

As we mentioned in the introduction, this case was already considered by
K. Marton (and others), but it illustrates our method in the simplest
setting, and also gives an alternative proof in this setting.

Indeed, if the chain is aperiodic and irreducible, then it is well known that

\[ \sup_{u,v\in E}\hat{\mathbb{P}}_{u,v}(T\ge j)\le e^{-cj} \]

for all $j\ge1$ and some $c>0$. Hence the Gaussian bound (21) holds.

8.2 House of cards processes 

These are Markov chains on the set of natural numbers which are useful
in the construction of couplings for processes with long-range memory, and
dynamical systems, see e.g. [1].

More precisely, a house of cards process is a Markov chain on the natural
numbers with transition probabilities

\[ \mathbb{P}(X_{k+1}=n+1\mid X_k=n)=1-q_n=1-\mathbb{P}(X_{k+1}=0\mid X_k=n), \]

for $n=0,1,2,\ldots$, i.e., the chain can go "up" by one unit or go "down" to
zero. Here, $0<q_n<1$.

In the present paper, house of cards chains serve as a nice class of examples
where we can have moment inequalities up to a certain order, depending
on the decay of $q_n$, and even Gaussian inequalities. Given a sequence of
independent uniformly distributed random variables $(U_k)$ on $[0,1]$, we can
view the process $X_k$ as generated via the recursion

\[ X_{k+1}=(X_k+1)\,\mathbb{1}\{U_{k+1}\ge q_{X_k}\}. \tag{24} \]

This representation also yields a coupling of the process for different initial
conditions. The coupling has the property that when the coupled chains
meet, they stay together forever. In particular, they will stay together forever
after they hit zero together. For this coupling, we have the following estimate.
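The coupling defined through recursion (24) is easy to simulate. The sketch below is an illustration only: the function names, the constant reset probability `q(n) = 0.3`, the seed and the horizon are assumptions made for the example, not choices from the text.

```python
import random

def step(x, u, q):
    """One step of recursion (24): go up if u >= q(x), otherwise reset to 0."""
    return x + 1 if u >= q(x) else 0

def coupled_paths(k, m, q, steps, rng):
    """Two house of cards chains started from k and m, driven by the same uniforms."""
    xs, ys = [k], [m]
    for _ in range(steps):
        u = rng.random()
        xs.append(step(xs[-1], u, q))
        ys.append(step(ys[-1], u, q))
    return xs, ys

def coupling_time(xs, ys):
    """First time the two copies agree (None if they never meet in the window).
    Since copies that meet stay together, this is the coupling time."""
    for t, (a, b) in enumerate(zip(xs, ys)):
        if a == b:
            return t
    return None

rng = random.Random(0)
xs, ys = coupled_paths(5, 2, lambda n: 0.3, 200, rng)
t = coupling_time(xs, ys)
```

Two copies started from different points can only meet at zero: if both go up the gap is preserved, and if only one resets they differ.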
Lemma 8.1. Consider the coupling defined via (24), started from the initial
condition $(k,m)$ with $k\ge m$. Then we have

\[ \hat{\mathbb{P}}_{k,m}(T>t)\le\prod_{j=0}^{t-1}\bigl(1-q^*_{k+j}\bigr), \tag{25} \]

where

\[ q^*_n=\inf_{s\le n}q_s. \]

Proof. Call $Y_t^k$ the process defined by (24) started from $k$, and define $Z_t^k$, a
process started from $k$ defined via the recursion

\[ Z_{t+1}=(Z_t+1)\,\mathbb{1}\{U_{t+1}\ge q^*_{Z_t}\}, \]

where $U_t$ is the same sequence of independent uniformly distributed random
variables as in (24). We claim that, for all $t\ge0$,

\[ Y_t^k\le Z_t^k,\qquad Y_t^m\le Z_t^k. \]

Indeed, the inequalities hold at time zero. Suppose they hold at time $t$; then,
since $q^*_n$ is non-increasing as a function of $n$,

\[ q_{Y_t^k}\ge q^*_{Y_t^k}\ge q^*_{Z_t^k}\quad\text{and}\quad q_{Y_t^m}\ge q^*_{Y_t^m}\ge q^*_{Z_t^k}, \]

whence

\[ \mathbb{1}\{U_{t+1}\ge q^*_{Z_t^k}\}\ge\mathbb{1}\{U_{t+1}\ge q_{Y_t^k}\}\quad\text{and}\quad\mathbb{1}\{U_{t+1}\ge q^*_{Z_t^k}\}\ge\mathbb{1}\{U_{t+1}\ge q_{Y_t^m}\}. \]

Therefore, in this coupling, if $Z_t^k=0$, then $Y_t^m=Y_t^k=0$, and hence the
coupling time is dominated by the first visit of $Z_t^k$ to zero, which gives

\[ \hat{\mathbb{P}}_{k,m}(T>t)\le\mathbb{P}\bigl(Z_n^k\neq0,\ n=1,\ldots,t\bigr)=\prod_{j=0}^{t-1}\bigl(1-q^*_{k+j}\bigr). \]

□

The behavior (25) of the coupling time shows the typical non-uniformity
as a function of the initial condition. More precisely, the estimate in the rhs
of (25) becomes bad for large $k$. We now look at three more concrete cases.
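The non-uniformity in the starting point can be seen numerically by evaluating the product of factors $1-q^*_{k+j}$, $j=0,\ldots,t-1$, on the right-hand side of (25). The sketch below assumes an illustrative reset sequence (0.5 for `n < 2`, `n ** -0.5` otherwise, in the spirit of the first case below); `q_star` and `tail_bound` are hypothetical helper names.

```python
def q_star(q, n):
    """Running infimum q*_n = min of q(s) over s <= n."""
    return min(q(s) for s in range(n + 1))

def tail_bound(q, k, t):
    """Product of (1 - q*_{k+j}) over j = 0..t-1: the bound on the tail of the
    coupling time for a coupling started from (k, m) with k >= m."""
    prod = 1.0
    for j in range(t):
        prod *= 1.0 - q_star(q, k + j)
    return prod

def q(n):
    """Illustrative reset probabilities, decaying like n^(-1/2)."""
    return 0.5 if n < 2 else n ** -0.5

bound_near_zero = tail_bound(q, 0, 50)
bound_far_away = tail_bound(q, 1000, 50)
```

For a fixed horizon `t`, the bound started far from zero is much larger than the one started near zero, because the reset probabilities seen by the dominating chain are smaller.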
1. Case 1:

\[ q_n=\frac{1}{n^{\alpha}},\quad n\ge2,\quad 0<\alpha<1. \]

Then it is easy to deduce from (25) that

\[ \hat{\mathbb{P}}_{k,m}(T>t)\le C\exp\Bigl(-\frac{1}{1-\alpha}\bigl((t+k)^{1-\alpha}-k^{1-\alpha}\bigr)\Bigr). \tag{26} \]

The stationary (probability) measure is given by

\[ \nu(k)=\pi_0\,c_k \tag{27} \]

with

\[ c_k=\prod_{j=0}^{k}(1-q_j), \tag{28} \]

which is bounded from above by

\[ c_k\le C'\exp\Bigl(-\frac{1}{1-\alpha}\,k^{1-\alpha}\Bigr). \tag{29} \]

From (26), combined with (27) and (29), it is then easy to see that the
constant $C_p$ of (18) is finite for all $p\in\mathbb{N}$. Therefore, in that case the
moment inequalities (17) hold for all $p\ge1$.



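2. Case 2:

\[ q_n=\frac{\gamma}{n}\quad(\gamma>0) \]

for $n\ge\gamma+1$, and the other values $q_i$ are arbitrary. In this case we obtain
from (25) the estimate

\[ \hat{\mathbb{P}}_{k,m}(T>t)\le C_\gamma\,\frac{(k+1)^{\gamma}}{(t+k+1)^{\gamma}}, \]

and for the stationary measure we have (28) with

\[ c_k\le C'_\gamma\,k^{-\gamma}. \]

The constant $C_p$ of (18) is therefore bounded by

\[ C_p\le C_\gamma\,C'_\gamma\,C_1\,C_2, \]

where $C_1=(2p-1)^{2p}(\zeta(1+\epsilon)/2)^p$ is finite independent of $\gamma$, and where

\[ C_2=C_2(p)=\sum_{k\ge1}k^{-\gamma}\Bigl(\hat{\mathbb{E}}_{k+1,0}(T+1)^{1+\delta}\Bigr)^{2p}, \]

so we estimate

\[ \hat{\mathbb{E}}_{k+1,0}(T+1)^{1+\delta}\le(1+\delta)\sum_{t\ge0}\frac{(k+1)^{\gamma}(t+1)^{\delta}}{(t+k+1)^{\gamma}}, \]

where $\delta := \epsilon/2$. To see when $C_2<\infty$, we first look at the behavior of

\[ I(a,b,k) := \sum_{t\ge0}\frac{(t+1)^{a}}{(t+k+1)^{b}}. \]

The sum in the rhs is convergent for $b-a>1$, in which case it behaves
as $k^{1+a-b}$ for $k$ large, which gives for our case $a=\delta$, $b=\gamma$, $\gamma>1+\delta$.
In that case, we find that $C_2(p)$ is finite as soon as

\[ \sum_{k\ge1}k^{-\gamma}\,k^{2p(1+\delta)}<\infty, \]

which gives

\[ \gamma>1+2p\delta+2p. \]

Hence, in this case, for $\gamma>1+\delta$, we obtain the moment estimates
(17) up to order $p<(\gamma-1)/2(\delta+1)$.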

3. Case 3:

\[ q := \inf\{q_n : n\in\mathbb{N}\}>0; \]

then we have the uniform estimate

\[ \sup_{k,m}\hat{\mathbb{P}}_{k,m}(T>t)\le(1-q)^{t}, \]

which gives the Gaussian concentration bound (21) with $C=\frac{1}{2(1-q)}$.
8.3 Ergodic interacting particle systems

As a final example, we consider spin-flip dynamics in the so-called $M<\epsilon$
regime. These are Markov processes on the space $E=\{0,1\}^{S}$, with $S$ a
countable set. This is a metric space with distance

\[ d(\eta,\xi)=\sum_{n=1}^{\infty}2^{-n}\,|\eta_{i_n}-\xi_{i_n}|, \]

where $n\mapsto i_n$ is a bijection from $\mathbb{N}$ to $S$.

The space $E$ is interpreted as the set of configurations of "spins" $\eta_i$ which
can be up (1) or down (0) and are defined on the set $S$ (usually taken to be a
lattice such as $\mathbb{Z}^d$). The spin at site $i\in S$ flips at a configuration-dependent
rate $c(i,\eta)$. The process is then defined via its generator on local functions,
defined by

\[ Lf(\eta)=\sum_{i\in S}c(i,\eta)\,\bigl(f(\eta^{i})-f(\eta)\bigr), \]

where $\eta^{i}$ is the configuration obtained from $\eta$ by flipping at site $i$. See [18]
for more details about existence and ergodicity of such processes.

We assume here that we are in the so-called "$M<\epsilon$ regime", where we
have the existence of a coupling (the so-called "basic coupling") for which
we have the estimate

\[ \hat{\mathbb{P}}_{\eta^{j},\eta}\bigl(\eta_i(t)\neq\zeta_i(t)\bigr)\le e^{-\epsilon t}\,e^{\Gamma t}(i,j), \tag{30} \]

with $\Gamma$ a matrix indexed by $S$ with finite $\ell_1$-norm $M<\epsilon$. As a
consequence, from any initial configuration, the system evolves exponentially fast
to its unique equilibrium measure, which we denote $\mu$. The stationary Markov
chain is then defined as $X_n=\eta_{n\delta}$ where $\delta>0$, and $\eta_0=X_0$ is distributed
according to $\mu$.

In the basic coupling, from (30), we obtain the uniform estimate

As a consequence, the quantity $M_r^{\eta,\zeta}$ of (20) is finite uniformly in $\eta,\zeta$, for
every $r>0$. Therefore, we have the Gaussian bound (21) with

k>l 
8.4 Measure concentration of Hamming neighborhoods



We apply Theorem 6.1 to measure concentration of Hamming neighborhoods.
The case of contracting Markov chains was already (and first) obtained in
[20] as a consequence of an information divergence inequality. We can easily
obtain such Gaussian measure concentration from (21). But, by a well-known
result of Bobkov and Götze [2], (21) and that information divergence
inequality are in fact equivalent. The interesting situation is when (21) does
not hold and we only have moment bounds.

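Let $A,B\subset E^{n}$ be two sets and denote by $\bar d(A,B)$ their normalized
Hamming distance, $\bar d(A,B)=\inf\{\bar d(x_1^n,y_1^n) : x_1^n\in A,\ y_1^n\in B\}$, where

\[ \bar d(x_1^n,y_1^n)=\frac1n\sum_{i=1}^{n}d(x_i,y_i), \]

with $d(x_i,y_i)=1$ if $x_i\neq y_i$, and $0$ otherwise. The $\epsilon$-neighborhood of $A$ is then

\[ [A]_\epsilon=\Bigl\{y_1^n : \inf_{x_1^n\in A}\bar d(x_1^n,y_1^n)\le\epsilon\Bigr\}. \]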
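For small alphabets and short words, the normalized Hamming distance to a set and the resulting neighborhood can be enumerated directly. The sketch below is illustrative only; the function names and the example set are assumptions.

```python
from itertools import product

def hamming(x, y):
    """Normalized Hamming distance between two words of equal length."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

def dist_to_set(y, A):
    """Normalized Hamming distance from the word y to the set A."""
    return min(hamming(x, y) for x in A)

def neighborhood(A, eps, alphabet, n):
    """All words of length n within normalized Hamming distance eps of A."""
    return {y for y in product(alphabet, repeat=n) if dist_to_set(y, A) <= eps}

A = {(0, 0, 0)}
nb = neighborhood(A, 1 / 3, (0, 1), 3)
```

With `A` reduced to the all-zero word of length 3 and `eps = 1/3`, the neighborhood consists of that word and the three words with exactly one flipped coordinate. Changing one coordinate of `y` changes `dist_to_set(y, A)` by at most `1/n`, which is the coordinate-wise Lipschitz property used in the proof below.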

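Theorem 8.1. Take any $n\in\mathbb{N}$ and let $A\subset E^{n}$ be a measurable set with
$\mathbb{P}(A)>0$. Then, under the assumptions of Theorem 6.1, we have

\[ \mathbb{P}([A]_\epsilon)\ge1-\frac{1}{n^{p}}\,\frac{C_p}{\Bigl(\epsilon-\dfrac{C_p^{1/2p}}{\sqrt{n}\,(\mathbb{P}(A))^{1/2p}}\Bigr)^{2p}} \]

for all

\[ \epsilon>\frac{C_p^{1/2p}}{\sqrt{n}\,(\mathbb{P}(A))^{1/2p}}. \]

Proof. We apply Theorem 6.1 to $f=\bar d(\cdot,A)$, which is a function defined on
$E^{n}$. It is easy to check that $\delta_i(f)\le1/n$, $i=1,\ldots,n$, so that $\|\delta(f)\|_2^2\le1/n$.
We first estimate $\mathbb{E}(f)$ by using (17), which gives (using the fact that $f|_A=0$)

\[ \mathbb{E}(f)\le\frac{C_p^{1/2p}}{\sqrt{n}\,(\mathbb{P}(A))^{1/2p}}. \]

Now we apply (19) with $t=\epsilon-\mathbb{E}(f)$:

\[ \mathbb{P}(f>\epsilon)\le\frac{1}{n^{p}}\,\frac{C_p}{\Bigl(\epsilon-\dfrac{C_p^{1/2p}}{\sqrt{n}\,(\mathbb{P}(A))^{1/2p}}\Bigr)^{2p}}. \]

The result then easily follows. □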
Remark 8.1. For $p=1$, the theorem holds under the assumption of Theorem
5.1, see Remark 6.1.

As we saw in Section 8.2, we cannot have Gaussian bounds for certain
house of cards processes, but only moment estimates up to a critical order. In
particular, this means that we cannot have a Gaussian measure concentration
of Hamming neighborhoods. But in that case we can apply the previous
theorem and get polynomial measure concentration.

Acknowledgment. The authors thank E. Verbitskiy for useful discussions 
on the house of cards process, and an anonymous referee for useful remarks. 